- *****************************************************************
-
- Voice Recognition for the Amiga using an audio digitizer.
-
- Voice.library (Ver 6.6) by Richard Horne - February 1993
-
- *****************************************************************
-
- FUNCTION OFFSET DEFINITIONS
-
- _LVOLearn EQU -30
- _LVORecognize EQU -36
- _LVOAddVoiceTask EQU -42
- _LVORemVoiceTask EQU -48
- _LVOGainUp EQU -54
- _LVOGainDown EQU -60
- _LVORecDataAddress EQU -66
- _LVORecMapAddress EQU -72
- _LVOWordScore EQU -78
- _LVOPickSampler EQU -84
- _LVOSetVoicePri EQU -90
- _LVOPickTimer EQU -96
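-
- The offsets above follow the standard Amiga library jump table: one
- 6-byte slot per function, with the first library-specific function at
- -30. A minimal C sketch of that arithmetic follows. The enum and helper
- names are illustrative only and are not part of voice.library.
-
```c
/* Index of each voice.library function in jump-table order.
   Names are illustrative, derived from the _LVO list above. */
enum voice_func {
    VF_LEARN, VF_RECOGNIZE, VF_ADD_VOICE_TASK, VF_REM_VOICE_TASK,
    VF_GAIN_UP, VF_GAIN_DOWN, VF_REC_DATA_ADDRESS, VF_REC_MAP_ADDRESS,
    VF_WORD_SCORE, VF_PICK_SAMPLER, VF_SET_VOICE_PRI, VF_PICK_TIMER
};

/* Each jump-table slot is 6 bytes; the first function sits at -30. */
static int voice_lvo(enum voice_func f)
{
    return -30 - 6 * (int)f;
}
```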
-
- ************************* FUNCTION DEFINITIONS ******************
-
- >>>>> All variables are long words unless otherwise noted. <<<<<<
-
- NOTE: Voice.library is opened with a call to the exec OpenLibrary
- function. OpenLibrary can fail for one of three reasons:
-
- 1. The voice.library file is not available in the libs: directory or
- cannot be found.
-
- 2. The parallel port is busy.
-
- 3. Voice.library is currently opened and being used by another
- application program.
-
- *****************************************************************
-
- NAME:
- Learn -- Learn a spoken phrase.
-
- OFFSET:
- -30
-
- SYNOPSIS:
- MapAddress = Learn (MapBuffer, Text, Screen, SequenceNum, X, Y)
- d0                  a0         a1    a2      d0           d1 d2
-
- FUNCTION:
- The "Learn" function stores a frequency map of a spoken word or
- phrase. Each frequency map is made up of 72 long words of data
- plus a 16 byte header for the associated ASCII text (304 bytes
- total). "Learn" requires the user to reserve a MapBuffer in
- memory equal to the size of vocabulary desired (number of words)
- times 304 bytes. MapBuffer address is passed to "Learn" in a0.
- Address of a null terminated text string representing the word or
- phrase to be learned is passed to "Learn" in a1.
-
- The "Learn" function will open its own window on the screen
- specified in a2 (use NULL for WBENCHSCREEN), at a position X, Y
- specified in d1 and d2. The user will then be prompted to speak
- the specified word or phrase to obtain three good digital
- samples. Internally, these three samples are analyzed for
- frequency content and transformed into a frequency map (304
- bytes) which is stored in the MapBuffer according to the Sequence
- Number specified in d0. "Learn" returns the memory address
- within MapBuffer at which this particular frequency map is
- stored. If "Learn" is intentionally cancelled using the close
- gadget of the Learn Window, then a zero will be returned.
-
- "Learn" is called separately for each word or phrase in the
- vocabulary. After every word has been learned, MapBuffer will be
- filled with a sequence of frequency maps (each 304 bytes). Then
- the "Recognize" or "AddVoiceTask" functions can be called which
- will listen to the audio digitizer, compute a frequency map
- of incoming words, compare them to the words in MapBuffer, and
- indicate by Sequence Number which word or phrase is the best
- match. The maximum number of words or phrases in the vocabulary
- is 64.
-
- Note that you must select an audio sampler (PerfectSound3,
- SoundMaster, or Generic) using the "PickSampler" function before
- using the "Learn" function.
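-
- The MapBuffer sizing described above can be sketched in C as follows.
- The macro and function names are hypothetical. The per-map offset
- assumes maps are packed back to back in Sequence Number order starting
- at 0; the address returned by "Learn" remains the authoritative value.
-
```c
#include <stddef.h>

#define VOICE_MAP_HEADER_BYTES 16   /* ASCII text header */
#define VOICE_MAP_LONGS        72   /* one 32-bit long word per time point */
#define VOICE_MAP_BYTES        (VOICE_MAP_HEADER_BYTES + 4 * VOICE_MAP_LONGS)
#define VOICE_MAX_WORDS        64   /* documented vocabulary limit */

/* Bytes to reserve for a vocabulary of n_words; 0 if out of range. */
static size_t voice_map_buffer_size(unsigned n_words)
{
    if (n_words == 0 || n_words > VOICE_MAX_WORDS)
        return 0;
    return (size_t)n_words * VOICE_MAP_BYTES;
}

/* Assumed byte offset of the map for a 0-based Sequence Number. */
static size_t voice_map_offset(unsigned sequence_num)
{
    return (size_t)sequence_num * VOICE_MAP_BYTES;
}
```
-
- For a 10-word vocabulary this gives 3040 bytes to reserve before the
- first call to "Learn".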
-
- *****************************************************************
-
- NAME:
- Recognize -- Recognize a spoken word or phrase.
-
- OFFSET:
- -36
-
- SYNOPSIS:
- SequenceNum = Recognize (MapBuffer, SizeVocabulary, Resolution)
- d0                       a0         d0              d1
-
- FUNCTION:
- "Recognize" assumes that the user has learned a sequence of words
- or phrases using the "Learn" function. MapBuffer contains a
- sequence of frequency maps produced by "Learn" corresponding to
- each word or phrase in the vocabulary. The MapBuffer address is
- passed to "Recognize" in a0. The number of words or phrases in the
- vocabulary is passed to "Recognize" in d0.
-
- "Recognize" listens for an incoming word, computes its frequency
- map, and compares this map to the sequence of maps contained in
- MapBuffer. The Sequence Number of the word or phrase in
- MapBuffer which is most similar to that of the incoming word is
- returned in d0. Note that the number "0" represents the first
- word, "1" the second, and so on.
-
- "Recognize" will operate at either high resolution (d1 = 0) or
- low resolution (d1 = 1). High resolution computes a frequency
- analysis of the incoming word or phrase at twice the number of
- points in time as low resolution. High resolution is somewhat
- better at word recognition, but takes almost twice the processing
- time.
-
- "Recognize" will return the following error codes if it cannot
- find a match.
-
- d0 = -1 if there is no match between the incoming frequency map
- and any of the maps in MapBuffer.
-
- d0 = -2 if the incoming word causes unacceptable digital
- clipping. Volume should be reduced by moving your
- microphone or by using the "GainDown" function.
-
- d0 = -3 if the incoming word is too low in volume. Volume should be
- increased by moving your microphone or by using the "GainUp"
- function.
-
- d0 = -4 if the incoming sample is confused by extraneous noise.
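-
- A caller will usually want to separate valid Sequence Numbers from the
- four error codes above. The following C sketch shows one way to do
- that. The enum names and message strings are illustrative and are not
- defined by voice.library.
-
```c
#include <stddef.h>

/* Return codes documented for Recognize (names are illustrative). */
enum {
    VOICE_NO_MATCH  = -1,
    VOICE_CLIPPING  = -2,
    VOICE_TOO_QUIET = -3,
    VOICE_NOISE     = -4
};

/* Describe a Recognize result. Returns NULL when the result is a
   valid Sequence Number (zero or greater). */
static const char *voice_error_text(long result)
{
    switch (result) {
    case VOICE_NO_MATCH:  return "no match";
    case VOICE_CLIPPING:  return "clipping: reduce volume or call GainDown";
    case VOICE_TOO_QUIET: return "too quiet: raise volume or call GainUp";
    case VOICE_NOISE:     return "extraneous noise";
    default:              return (result >= 0) ? NULL : "unknown error";
    }
}
```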
-
- *****************************************************************
-
- NAME:
- AddVoiceTask -- Initiate a separate task to recognize a spoken
- word or phrase.
-
- OFFSET:
- -42
-
- SYNOPSIS:
- AddVoiceTask (MapBuffer, MsgPort, SizeVocabulary, Resolution)
-               a0         a1       d0              d1
-
- FUNCTION:
- "AddVoiceTask" is similar in function to "Recognize" except that
- here, a separate task is started under the Amiga multitasking
- operating system which listens for incoming words or phrases and
- returns messages to the user's Message Port indicating the
- Sequence Number of the frequency map in MapBuffer which best
- matches the frequency map of the incoming word. The MapBuffer
- address and Message Port address are passed to "AddVoiceTask"
- in a0 and a1. The number of words or phrases in the vocabulary is
- passed to "AddVoiceTask" in d0.
-
- "AddVoiceTask" will operate at either high resolution (d1 = 0) or
- low resolution (d1 = 1). High resolution computes a frequency
- analysis of the incoming word or phrase at twice the number of
- points in time as low resolution. High resolution is somewhat
- better at word recognition, but takes almost twice the processing
- time.
-
- The messages sent to MessagePort are designed to mimic shortened
- IDCMP messages with an im_Class = $0. Thus you can receive and
- process these messages at either an Intuition window IDCMP
- message port or at a custom message port of your own.
- Messages sent by this task are as follows.
-
- im_Code = Sequence number of frequency map in MapBuffer that
- best matches the frequency map of the incoming
- word or phrase.
-
- im_Code = -1 if there is no match between the incoming
- frequency map and any of the maps in MapBuffer.
-
- im_Code = -2 if the incoming word causes unacceptable
- digital clipping. Volume should be reduced by
- moving your microphone or by using the "GainDown"
- function.
-
- im_Code = -3 if the incoming word is too low in volume. Volume
- should be increased by moving your microphone or
- by using the "GainUp" function.
-
- im_Code = -4 if the incoming sample is confused by
- extraneous noise.
-
- Upon calling "AddVoiceTask", the selected audio digitizer becomes
- immediately active, listening for an incoming word. After
- receipt of a word or phrase, a message as described above is sent
- to Message Port. The VoiceTask then goes into a WAIT mode and
- remains inactive until it receives a reply to the message it has
- sent to Message Port. Upon receipt of a reply, VoiceTask again
- becomes active and listens for an incoming word. The priority
- of this task will be 127 for fastest possible voice recognition.
- You may change this priority to a lower value with the "SetVoicePri"
- function.
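-
- Intuition's IntuiMessage stores its Code field as an unsigned 16-bit
- word. If the shortened messages keep that field width (an assumption;
- the exact message layout is not given here), the negative codes above
- arrive as $FFFF through $FFFC. A minimal C sketch of converting such a
- value back to a signed result, with a hypothetical helper name:
-
```c
#include <stdint.h>

/* Interpret a 16-bit im_Code value as a signed result.
   Values below 0x8000 are returned unchanged (Sequence Numbers);
   0xFFFF..0xFFFC map to the error codes -1..-4. */
static long voice_code_to_result(uint16_t im_code)
{
    return (im_code >= 0x8000u) ? (long)im_code - 0x10000L
                                : (long)im_code;
}
```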
-
- *****************************************************************
-
- NAME:
- RemVoiceTask -- Remove task initiated by AddVoiceTask
-
- OFFSET:
- -48
-
- SYNOPSIS:
- RemVoiceTask ()
-
- FUNCTION:
- Deallocates memory and removes VoiceTask from the Amiga system.
- Note that the Message Port specified for the "AddVoiceTask" function
- must still exist at the time you call "RemVoiceTask". Also you
- must reply to all outstanding messages from VoiceTask BEFORE calling
- this function.
-
- *****************************************************************
-
- NAME:
- GainUp -- Increase gain of PerfectSound 3 audio digitizer.
-
- OFFSET:
- -54
-
- SYNOPSIS:
- GainUp()
-
- FUNCTION:
- Increases gain of the PerfectSound audio digitizer by one step.
- Note that when gain reaches maximum, "GainUp" will wrap around
- and return gain to its lowest value. Do not call this function
- if you are using the SoundMaster audio digitizer.
-
- *****************************************************************
-
- NAME:
- GainDown -- Decrease gain of PerfectSound 3 audio digitizer.
-
- OFFSET:
- -60
-
- SYNOPSIS:
- GainDown()
-
- FUNCTION:
- Decreases gain of the PerfectSound audio digitizer by one step.
- Note that when gain reaches minimum, "GainDown" will wrap around
- and return gain to its highest value. Do not call this function
- if you are using the SoundMaster audio digitizer.
-
- *****************************************************************
-
- NAME:
- RecDataAddress -- Return memory address of digital sample of
- incoming word or phrase.
-
- OFFSET:
- -66
-
- SYNOPSIS:
- Address = RecDataAddress()
- d0
-
-
- FUNCTION:
- When an incoming word or phrase is digitized, 3/4 second of
- digital data is stored in an internal buffer. This 8-bit
- digitized data is sampled at a rate of 6400 Hz. Thus the buffer
- for storing this data is 4800 bytes in size. This function
- returns the address of this buffer for possible additional
- experimental uses.
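-
- The buffer arithmetic above can be written out in C as follows. The
- names are hypothetical, and the sketch only covers sizes and timing;
- whether the 8-bit samples are signed or unsigned is not stated here.
-
```c
#define VOICE_SAMPLE_RATE_HZ 6400L
/* 3/4 second at 6400 samples per second, one byte per sample. */
#define VOICE_SAMPLE_BYTES   (VOICE_SAMPLE_RATE_HZ * 3 / 4)

/* Time of sample `index` from the start of the buffer, in
   microseconds. One sample lasts 1000000/6400 = 625/4 microseconds;
   the reduced fraction keeps the product within 32 bits. */
static long voice_sample_time_us(long index)
{
    return index * 625L / 4;
}
```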
-
- *****************************************************************
-
- NAME:
- RecMapAddress -- Return memory address of frequency map of
- incoming word or phrase.
-
- OFFSET:
- -72
-
- SYNOPSIS:
- Address = RecMapAddress()
- d0
-
- FUNCTION:
- A frequency map of each incoming word or phrase is computed for
- comparison with maps learned and stored in MapBuffer. Each map
- consists of a frequency analysis of 3/4 second of audio data at
- 72 points in time. For each of these 72 time points, the data is
- examined for frequency content at 32 points between 0 Hz and 3200
- Hz. A frequency map is made up of 72 long words (32 bits each),
- one for each of the 72 time points analyzed. For each of these
- long words, bit 0 is set if the signal contains frequency
- components from 0-100 Hz, bit 1 is set if it contains components
- from 100-200 Hz, bit 2 is set if it contains components from
- 200-300 Hz, and so on up to bit 31 for 3100-3200 Hz. This function
- returns the address of this frequency map for possible additional
- experimental uses. Note that this internal frequency map does
- not have the 16 byte ASCII header as do the frequency maps
- stored in MapBuffer.
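-
- To experiment with a frequency map, each long word can be decoded one
- bit per 100 Hz band. The C sketch below uses hypothetical helper names
- and assumes a frequency exactly on a band boundary (for example 100 Hz)
- belongs to the higher band, which the description above leaves open.
-
```c
#include <stdint.h>

#define VOICE_BAND_HZ 100   /* width of each of the 32 bands */

/* Nonzero if the map word for one time point has its bit set for
   the 100 Hz band containing freq_hz (meaningful for 0..3199 Hz). */
static int voice_word_has_freq(uint32_t map_word, unsigned freq_hz)
{
    unsigned bit = freq_hz / VOICE_BAND_HZ;
    if (bit > 31)
        return 0;
    return (int)((map_word >> bit) & 1u);
}

/* Number of bands set in one map word. */
static int voice_word_band_count(uint32_t map_word)
{
    int n = 0;
    while (map_word) {
        n += (int)(map_word & 1u);
        map_word >>= 1;
    }
    return n;
}
```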
-
- *****************************************************************
-
- NAME:
- WordScore -- Return recognition score of a recognized word.
-
- OFFSET:
- -78
-
- SYNOPSIS:
- Value = WordScore()
- d0
-
- FUNCTION:
- The "Recognize" function computes a numerical score representing the
- "goodness" of a match between the frequency map of an incoming word
- and each frequency map stored in MapBuffer. The recognized word
- is determined by highest score. This function returns the score value
- for the recognized word. Internally, a score of #2000 must be achieved
- in order for a match to be declared. If you wish to have a higher match
- score threshold to reduce false matches, you may call "WordScore" after
- each word is recognized and set your own higher score threshold before
- accepting a match. Increasing the match score threshold will reduce
- false matches, but will also decrease recognition performance.
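-
- A stricter acceptance test of the kind described above might look like
- the following C sketch. The function name is hypothetical, the score is
- assumed to be the value returned by "WordScore" after a recognition,
- and the internal threshold #2000 is read as decimal 2000.
-
```c
#define VOICE_INTERNAL_MIN_SCORE 2000L   /* library threshold, read as decimal */

/* Accept a recognized word only if its score reaches the caller's
   threshold. Thresholds below the library's own minimum are raised
   to it, since lower-scoring words are never reported as matches. */
static int voice_accept(long sequence_num, long score, long threshold)
{
    if (sequence_num < 0)
        return 0;               /* an error code, not a match */
    if (threshold < VOICE_INTERNAL_MIN_SCORE)
        threshold = VOICE_INTERNAL_MIN_SCORE;
    return score >= threshold;
}
```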
-
-
- *****************************************************************
-
- NAME:
- PickSampler -- Specify which model audio sampler to use (either
- PerfectSound3, SoundMaster, or Generic).
-
- OFFSET:
- -84
-
- SYNOPSIS:
- PickSampler (SamplerID)
-              d0
-
- FUNCTION:
- Selects the audio sampler to be used by the library. SamplerID = 0
- for PerfectSound3. SamplerID = 1 for SoundMaster. SamplerID = 2 for
- Generic Sampler.
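-
- The SamplerID values can be given names in C, as sketched below. The
- identifiers are illustrative. The gain guard is a conservative
- assumption: "GainUp" and "GainDown" are documented for PerfectSound 3
- and prohibited for SoundMaster, and nothing is stated for Generic.
-
```c
/* SamplerID values documented for PickSampler (names illustrative). */
enum voice_sampler {
    VOICE_SAMPLER_PERFECTSOUND3 = 0,
    VOICE_SAMPLER_SOUNDMASTER   = 1,
    VOICE_SAMPLER_GENERIC       = 2
};

/* Conservative guard: treat gain control as available only on the
   sampler it is documented for. */
static int voice_gain_supported(enum voice_sampler id)
{
    return id == VOICE_SAMPLER_PERFECTSOUND3;
}
```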
-
- *****************************************************************
-
- NAME:
- SetVoicePri -- Set the multitasking priority of a voice recognition
- task that has been started by the "AddVoiceTask"
- function.
-
- OFFSET:
- -90
-
- SYNOPSIS:
- Old Priority = SetVoicePri (New Priority)
- d0                          d0
-
- FUNCTION:
- When "AddVoiceTask" is called, a voice recognition task of priority 127
- is started for the fastest possible voice recognition. You may modify
- this priority by setting New Priority to any value between -128 and 127
- and calling "SetVoicePri" which changes task priority to the new value
- and returns the value of the old task priority. "AddVoiceTask" must
- be called before "SetVoicePri."
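-
- Since only values between -128 and 127 are accepted, a caller may wish
- to clamp a computed priority before passing it on. A small C sketch,
- with a hypothetical helper name:
-
```c
/* Clamp a requested priority to the -128..127 range documented
   for SetVoicePri. */
static long voice_clamp_pri(long pri)
{
    if (pri < -128)
        return -128;
    if (pri > 127)
        return 127;
    return pri;
}
```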
-
- *****************************************************************
-
- NAME:
- PickTimer -- Select either Timer A or Timer B of the CIA B for use
- in timing digital audio samples.
-
- OFFSET:
- -96
-
- SYNOPSIS:
- PickTimer(TimerID)
-           d0
-
- FUNCTION:
- Voice.library uses CIA B Timer B by default for setting the time interval
- between digital audio samples. You may find situations where other
- applications require Timer B, causing a conflict. Use this function to
- choose either Timer B or Timer A as required. TimerID = 0 for selection
- of Timer B. TimerID = 1 for selection of Timer A.
-
-
-